Compositions and methods for the detection and treatment of cervical cancer and cervical intraepithelial neoplasia

ABSTRACT

Described herein are biomarkers for HPV-associated pre-cancers and cancers such as cervical cancer and cervical intraepithelial neoplasia. The RNA binding protein (RBP) and long-noncoding RNA (lnc-RNA) biomarkers can be detected and used to diagnose HPV-associated pre-cancers and cancers. In addition, early diagnosis of HPV-associated pre-cancers and cancers can facilitate therapeutic intervention in patients, particularly in the pre-cancer stage which can delay or prevent progression to cancer.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/270,774 filed on Sep. 20, 2016, which is incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH & DEVELOPMENT

This invention was made in part with government support from the National Institutes of Health. The government has certain rights in this invention.

FIELD OF THE DISCLOSURE

The present disclosure is related to novel polynucleotide biomarkers which can be detected and can be used for the diagnosis of HPV-associated pre-cancers and HPV-associated cancers such as cervical cancer and cervical intraepithelial neoplasia as well as methods of treatment of HPV-associated pre-cancers and HPV-associated cancers.

BACKGROUND

High-risk HPV persistent infection leads to the development of certain types of cancers in the cervix, anus, and oropharynx, for example. Fifteen mucosal HPV types are identified as oncogenic or high-risk (HR) HPVs, with HPV16 and HPV18 being particularly associated with invasive cervical cancer. Cervical cancer is the second most common cancer among women worldwide. Approximately 500,000 incident cases of cervical cancer and approximately 320,000 cervical cancer deaths are estimated each year, and more than 80% of the cases arise in developing countries.

There is a need for diagnostic markers that can be detected and used for early diagnosis of high-risk HPV infection, HPV-associated pre-cancer and HPV-associated cancer and for the development of intervention strategies for treatment of HPV-induced cancers.

SUMMARY

In one aspect, a method of determining if a test patient has stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia or cervical cancer comprises

-   determining an expression level of a first polynucleotide biomarker in a sample containing cells from the test patient's cervix with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker, wherein the first polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), CDKN2A (SEQ ID NOs: 1-4), ELAVL2 (SEQ ID NOs: 5-7), HSPB1 (SEQ ID NO: 12), KHSRP (SEQ ID NO: 13), PTBP1 (SEQ ID NOs: 16-18), or a combination thereof,
-   correlating the expression level of the first polynucleotide biomarker in the sample containing cells from the test patient's cervix to a reference expression level of the first polynucleotide biomarker in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of cervical cancer,
    -   a control sample from a cervical cancer patient or patients, or
    -   a control sample from a patient or patients with stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia, and
-   determining, based on said correlation, if the test patient has cervical cancer, or stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia.

In another aspect, the method of determining if a test patient has stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia or cervical cancer comprises

-   determining an expression level of a first polynucleotide biomarker in a sample containing cells from the test patient's cervix with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker, wherein the first polynucleotide biomarker is GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), or a combination thereof, and/or
-   determining an expression level of a second polynucleotide biomarker in the sample containing cells from the test patient's cervix with one or more second polynucleotides that hybridizes to the second polynucleotide biomarker, wherein the second polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, or a combination thereof.

In a further aspect, a method of quantitating an expression level of a first polynucleotide biomarker in a sample containing cells from a test patient's cervix with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker comprises

-   contacting the sample containing cells from the test patient's cervix with the one or more first polynucleotides, and
-   detecting the level of hybridization of the one or more first polynucleotides to the first polynucleotide biomarker,
-   wherein the first polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), CDKN2A (SEQ ID NOs: 1-4), ELAVL2 (SEQ ID NOs: 5-7), HSPB1 (SEQ ID NO: 12), KHSRP (SEQ ID NO: 13), PTBP1 (SEQ ID NOs: 16-18), or a combination thereof.

In a yet further aspect, a method of treating a test patient in need of treatment for stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia or cervical cancer comprises

-   determining an expression level of a first polynucleotide biomarker in a sample containing cells from the test patient's cervix with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker, wherein the first polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), CDKN2A (SEQ ID NOs: 1-4), ELAVL2 (SEQ ID NOs: 5-7), HSPB1 (SEQ ID NO: 12), KHSRP (SEQ ID NO: 13), PTBP1 (SEQ ID NOs: 16-18), or a combination thereof,
-   correlating the expression level of the first polynucleotide biomarker in the sample containing cells from the test patient's cervix to a reference expression level of the first polynucleotide biomarker in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of cervical cancer,
    -   a control sample from a cervical cancer patient or patients, or
    -   a control sample from a patient or patients with stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia, and
-   administering a therapeutic intervention for the treatment of stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia, or cervical cancer when it is determined, based on said expression levels, that the test patient has stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia.

In a still further aspect, a method of determining if a test patient has an HPV-associated pre-cancer or an HPV-associated cancer comprises

-   determining an expression level of a first polynucleotide biomarker in a sample containing cells from a tissue of the test patient with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker,
-   correlating the expression level of the first polynucleotide biomarker in the sample containing cells from the tissue of the test patient to a reference expression level of the first polynucleotide biomarker in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of HPV-associated pre-cancer or HPV-associated cancer,
    -   a control sample from a patient or patients with HPV-associated pre-cancer, or
    -   a control sample from a patient or patients with HPV-associated cancer, and
-   determining, based on said correlation, if the test patient has HPV-associated pre-cancer or HPV-associated cancer,
-   wherein the first polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), CDKN2A (SEQ ID NOs: 1-4), ELAVL2 (SEQ ID NOs: 5-7), HSPB1 (SEQ ID NO: 12), KHSRP (SEQ ID NO: 13), PTBP1 (SEQ ID NOs: 16-18), or a combination thereof.

In another aspect, a method of quantitating an expression level of a first polynucleotide biomarker in a sample containing cells from a tissue of the test patient with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker comprises

-   contacting the sample containing cells from a tissue of the test patient with the one or more first polynucleotides, and
-   detecting the level of hybridization of the one or more first polynucleotides to the first polynucleotide biomarker,
-   wherein the first polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), CDKN2A (SEQ ID NOs: 1-4), ELAVL2 (SEQ ID NOs: 5-7), HSPB1 (SEQ ID NO: 12), KHSRP (SEQ ID NO: 13), PTBP1 (SEQ ID NOs: 16-18), or a combination thereof.

In a yet further aspect, a method of treating a test patient in need of treatment for an HPV-associated pre-cancer or an HPV-associated cancer comprises

-   determining an expression level of a first polynucleotide biomarker in a sample containing cells from a tissue of the test patient with one or more first polynucleotides that hybridizes to the first polynucleotide biomarker,
-   correlating the expression level of the first polynucleotide biomarker in the sample containing cells from the tissue of the test patient to a reference expression level of the first polynucleotide biomarker in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of HPV-associated pre-cancer or HPV-associated cancer,
    -   a control sample from a patient or patients with HPV-associated pre-cancer, or
    -   a control sample from a patient or patients with HPV-associated cancer, and
-   administering a therapeutic intervention for the treatment of HPV-associated pre-cancer or HPV-associated cancer when it is determined, based on said expression levels, that the test patient has HPV-associated pre-cancer or an HPV-associated cancer, wherein the first polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, GRB7 (SEQ ID NOs: 8-11 and 94), NOVA1 (SEQ ID NOs: 14, 15 and 95), RNASEH2A (SEQ ID NO: 19), CDKN2A (SEQ ID NOs: 1-4), ELAVL2 (SEQ ID NOs: 5-7), HSPB1 (SEQ ID NO: 12), KHSRP (SEQ ID NO: 13), PTBP1 (SEQ ID NOs: 16-18), or a combination thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon receipt and payment of the necessary fee.

FIG. 1 is a flowchart of the RNA-sequencing (RNA-Seq) analyses for RNA-binding proteins (RBPs).

FIG. 2 shows Venn diagrams showing 95 differentially expressed RBP genes being identified from two separate RNA-Seq analyses of cervical cancer, pre-cancer to normal cervical tissues.

FIG. 3 shows a heat map comparing 95 differentially expressed RBP genes in cervical cancer to normal cervical tissues.

FIG. 4 shows the TaqMan® RT-qPCR validation of the 8 selected RBPs.

FIG. 5 shows that high-risk HPV16 infection affects the expression of RBPs. Total RNA extracted from human vaginal keratinocyte (HVK)-derived raft cultures with (HVK16) or without (HVK) productive HPV16 infection and human foreskin keratinocyte (HFK)-derived raft cultures with (HFK16) or without (HFK) productive HPV16 infection was examined by TaqMan® RT-qPCR for the expression of 8 RBPs.

FIG. 6 shows that high-risk HPV18 infection affects the expression of RBPs. Total RNA extracted from human vaginal keratinocyte (HVK)-derived raft cultures with (HVK18) or without (HVK) productive HPV18 infection and human foreskin keratinocyte (HFK)-derived raft cultures with (HFK18) or without (HFK) productive HPV18 infection was examined by TaqMan® RT-qPCR for the expression of 8 RBPs.

FIG. 7 shows that both HPV16 and HPV18 increase the expression of CDKN2A and RNASEH2A, but decrease the expression of NOVA1 in HFK- and HVK-derived rafts.

FIG. 8 shows that HPV18 infection and viral E6 and/or E7 affect the expression of RNASEH2A and NOVA1. The expression of RNASEH2A and NOVA1 in primary human keratinocytes (PHK)-derived raft tissues with or without HPV18 infection on day 8, day 12, and day 16 or PHK rafts transduced with a retrovirus expressing HPV18 E6, E7 or E6E7 or with an empty control retrovirus was further validated by TaqMan® RT-qPCR.

FIG. 9 shows that knockdown or overexpression of RNASEH2A in HeLa or CaSki cells affects cell proliferation. The effect of specific-siRNA knockdown or ectopic expression of RNASEH2A from a mammalian expression vector in HeLa or CaSki cells on cell proliferation was evaluated by Cell Counting Kit-8 (CCK-8) assay.

FIG. 10 shows HPV oncoprotein E7 regulates the expression of RNASEH2A via E2F1. The effect of specific-siRNA knockdown or ectopic expression of E2F1 from a mammalian expression vector in HeLa or CaSki cells on RNASEH2A was evaluated by Western blot.

FIG. 11 is a flowchart of the RNA-Seq analyses for long-noncoding RNAs (lnc-RNAs).

FIG. 12 is a heat map showing 209 overlapped, differentially expressed lnc-RNAs from cervical cancer, pre-cancer to normal cervical tissues.

FIG. 13 shows an increase of lnc-FANCI-2, and decrease of lnc-GLB1L2-1 expression along with the cervical lesion progression from normal cervix. Lnc-FANCI-2 and lnc-GLB1L2-1 RNA expression was examined by RT-qPCR in 24 normal, 25 CIN 2-3, and 23 cancer tissues.

FIG. 14 shows that HPV infection increases lnc-FANCI-2 expression in HVK- and PHK-derived rafts and viral E7 or E6 is responsible for the increase. Shown is the expression of lnc-FANCI-2 in human vaginal keratinocytes (HVK)-derived raft tissues without (HVK) or with HPV16 (HVK16) or HPV18 (HVK18) infection or primary human keratinocytes (PHK)-derived raft tissues without or with HPV18 infection.

The above-described and other features will be appreciated and understood by those skilled in the art from the following detailed description, drawings, and appended claims.

DETAILED DESCRIPTION

Using an RNA-sequencing (RNA-Seq) approach, the inventors of the present application examined seven normal cervical tissues and seven cervical cancer tissues for their expression landscapes of approximately 19,000 coding and 113,513 noncoding RNAs. 614 differentially expressed coding transcripts enriched in cancer related pathways were identified, with 95 of them encoding RNA-binding proteins (RBPs) from the analyzed 1502 human RBPs. Moreover, 209 differentially, abundantly expressed long-noncoding RNAs (lnc-RNAs) from normal cervix to cervical cancer were identified. Validation of the altered expression of 26 candidates, including 8 RBP genes, by using TaqMan® real-time PCR in a cohort of 47 human cervical tissue samples, including 24 normal cervical tissues and 23 cervical cancer tissues, showed that they are broadly involved in cervical carcinogenesis. Many of the identified RBP candidates had not been previously reported. Using human vaginal keratinocyte-derived raft culture tissues with or without HPV16 and HPV18 infection, it was further corroborated that these RBP candidates, including CDKN2A, ELAVL2, GRB7, HSPB1, KHSRP, PTBP1, RNASEH2A, and NOVA1, are regulated by HPV infection. Further, the inventors found that lnc-FANCI-2 was increasingly expressed along with cervical lesion progression from cervical intraepithelial neoplasia (CIN) to cervical cancer, when compared to the normal tissues. In contrast, lnc-GLB1L2-1 was gradually decreased along with the lesion progression, when compared to the normal tissues. In addition, FAM83A, SEMA3F, CLDN10, and ASRGL1, which are not RBPs, were also found to have altered expression in cervical cancer compared to normal tissue, with FAM83A and SEMA3F being increased in cervical cancer and CLDN10 and ASRGL1 being decreased in cervical cancer. The results presented herein provide the first comprehensive expression atlas of RBPs and lnc-RNAs in normal cervix and cervical cancer, which can be detected to provide better diagnosis and treatment of patients with cervical cancer.

More specifically, an increase of lnc-FANCI-2 RNA, including all of its 35 isoforms, and a decrease of lnc-GLB1L2-1, including its 21 isoforms, were identified in cervical cancer. Patients with Fanconi anemia (FA) frequently develop squamous cell carcinoma at sites that are associated with HPV-driven cancer, including the female reproductive tract, and FA is caused by mutations in one of 15 genes in the FA pathway (including FANCA, FANCD2, and FANCI). Loss of FA pathway components FANCA and FANCD2 stimulates E7 protein accumulation in human keratinocytes, and loss of FANCD2 stimulates HPV DNA replication. Both FANCI and lnc-FANCI-2 are expressed from the same location at chromosome 15q26.1. Further, both GLB1L2 (galactosidase, beta 1-like 2) and lnc-GLB1L2-1 are expressed from chromosome 11q25, with unknown function in cancer development. By using TaqMan® qRT-PCR validation of lnc-FANCI-2 and lnc-GLB1L2-1 in 24 normal, 25 CIN 2-3, and 23 cervical cancer tissues, it was confirmed that altered expression of these lnc-RNAs is remarkably related to cervical lesion progression from CIN to cancer. Moreover, the altered expression of lnc-FANCI-2 could be attributed to HPV16 and HPV18 infection in raft cultures and viral E7 expression. These lnc-RNAs are biomarkers for early diagnosis of high-risk HPV infection with high risk of progression and for development of intervention strategies to treat HPV-induced cancers.

As used herein, a non-coding RNA (ncRNA) is an RNA transcript that does not encode a protein. ncRNAs include short ncRNAs and long ncRNAs (lnc-RNAs). Short ncRNAs are ncRNAs that are generally 18-200 nucleotides (nt) in length. Examples of short ncRNAs include, but are not limited to, microRNAs (miRNAs), piwi-associated RNAs (piRNAs), short interfering RNAs (siRNAs), promoter-associated short RNAs (PASRs), transcription initiation RNAs (tiRNAs), termini-associated short RNAs (TASRs), antisense termini associated short RNAs (aTASRs), small nucleolar RNAs (snoRNAs), transcription start site antisense RNAs (TSSa-RNAs), small nuclear RNAs (snRNAs), retroposon-derived RNAs (RE-RNAs), 3′UTR-derived RNAs (uaRNAs), x-ncRNA, human Y RNA (hY RNA), unusually small RNAs (usRNAs), small NF90-associated RNAs (snaRs), vault RNAs (vtRNAs), small Cajal body-specific RNAs (scaRNAs), and telomere specific small RNAs (tel-sRNAs). lnc-RNAs are cellular RNAs, exclusive of rRNAs, greater than 200 nucleotides in length and having no obvious protein-coding capacity. Lnc-RNAs include, but are not limited to, large or long intergenic ncRNAs (lincRNAs), transcribed ultraconserved regions (T-UCRs), pseudogenes, GAA-repeat containing RNAs (GRC-RNAs), long intronic ncRNAs, antisense RNAs (aRNAs), promoter-associated long RNAs (PALRs), promoter upstream transcripts (PROMPTs), and long stress-induced non-coding transcripts (LSINCTs).

An RNA-binding protein is a protein that binds single or double stranded RNA to form ribonucleoprotein complexes. RBPs contain conserved structural motifs such as the RNA recognition motif (RRM), dsRNA binding domain, zinc finger domain, and others.

The biomarkers for detection and diagnosis of CIN and cervical cancer include the RBP and lnc-RNA biomarkers of Tables 1-3:

TABLE 1 RBP biomarkers

| SEQ ID NO: | chr | start | end | refseqID | Symbol | description |
|---|---|---|---|---|---|---|
| 1 | chr9 | 21967750 | 21975132 | NM_000077 | CDKN2A | cyclin-dependent kinase inhibitor 2A (CDKN2A), transcript variant 1, mRNA. |
| 2 | chr9 | 21967750 | 21975132 | NM_001195132 | CDKN2A | Homo sapiens cyclin-dependent kinase inhibitor 2A (CDKN2A), transcript variant 5, mRNA. |
| 3 | chr9 | 21967750 | 21994490 | NM_058195 | CDKN2A | cyclin-dependent kinase inhibitor 2A (CDKN2A), transcript variant 4, mRNA. |
| 4 | chr9 | 21967750 | 21974826 | NM_058197 | CDKN2A | cyclin-dependent kinase inhibitor 2A (CDKN2A), transcript variant 3, mRNA. |
| 5 | chr9 | 23690102 | 23821843 | NM_001171195 | ELAVL2 | Homo sapiens ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 (Hu antigen B) (ELAVL2), transcript variant 2, mRNA. |
| 6 | chr9 | 23690102 | 23821478 | NM_001171197 | ELAVL2 | Homo sapiens ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 (Hu antigen B) (ELAVL2), transcript variant 3, mRNA. |
| 7 | chr9 | 23690102 | 23826063 | NM_004432 | ELAVL2 | ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 (Hu antigen B) (ELAVL2), transcript variant 1, mRNA. |
| 8 | chr17 | 37894575 | 37903538 | NM_001030002 | GRB7 | growth factor receptor-bound protein 7 (GRB7), transcript variant 2, mRNA. |
| 9 | chr17 | 37895023 | 37903538 | NM_001242442 | GRB7 | Homo sapiens growth factor receptor-bound protein 7 (GRB7), transcript variant 4, mRNA. |
| 10 | chr17 | 37896219 | 37903538 | NM_001242443 | GRB7 | Homo sapiens growth factor receptor-bound protein 7 (GRB7), transcript variant 3, mRNA. |
| 11 | chr17 | 37894161 | 37903538 | NM_005310 | GRB7 | growth factor receptor-bound protein 7 (GRB7), transcript variant 1, mRNA. |
| 94 | chr17 | | | NM_001330207.1 | GRB7 | growth factor receptor-bound protein 7 (GRB7), transcript variant 5, mRNA. |
| 12 | chr7 | 75931874 | 75933614 | NM_001540 | HSPB1 | heat shock 27 kDa protein 1 (HSPB1), mRNA. |
| 13 | chr19 | 6413118 | 6424822 | NM_003685 | KHSRP | KH-type splicing regulatory protein (KHSRP), mRNA. |
| 14 | chr14 | 26915088 | 27066960 | NM_002515 | NOVA1 | neuro-oncological ventral antigen 1 (NOVA1), transcript variant 1, mRNA. |
| 15 | chr14 | 26915088 | 27066960 | NM_006489 | NOVA1 | neuro-oncological ventral antigen 1 (NOVA1), transcript variant 2, mRNA. |
| 95 | chr14 | | | NM_006491.2 | NOVA1 | neuro-oncological ventral antigen 1 (NOVA1), transcript variant 3, mRNA. |
| 16 | chr19 | 797391 | 812327 | NM_002819 | PTBP1 | polypyrimidine tract binding protein 1 (PTBP1), transcript variant 1, mRNA. |
| 17 | chr19 | 797391 | 812327 | NM_031990 | PTBP1 | polypyrimidine tract binding protein 1 (PTBP1), transcript variant 2, mRNA. |
| 18 | chr19 | 797391 | 812327 | NM_031991 | PTBP1 | polypyrimidine tract binding protein 1 (PTBP1), transcript variant 3, mRNA. |
| 19 | chr19 | 12917427 | 12924462 | NM_006397 | RNASEH2A | ribonuclease H2, subunit A (RNASEH2A), mRNA. |

TABLE 2 lnc-FANCI-2 isoforms

| Transcript ID | SEQ ID NO: | Location (hg19) | Length |
|---|---|---|---|
| lnc-FANCI-2:1 | 20 | chr15: 89904810-89938553 | 1613 |
| lnc-FANCI-2:10 | 21 | chr15: 89921280-89938544 | 606 |
| lnc-FANCI-2:11 | 22 | chr15: 89921331-89938354 | 551 |
| lnc-FANCI-2:12 | 23 | chr15: 89921347-89939471 | 1877 |
| lnc-FANCI-2:13 | 24 | chr15: 89921362-89938500 | 561 |
| lnc-FANCI-2:14 | 25 | chr15: 89921794-89931745 | 786 |
| lnc-FANCI-2:15 | 26 | chr15: 89922355-89938350 | 569 |
| lnc-FANCI-2:16 | 27 | chr15: 89922468-89941720 | 3779 |
| lnc-FANCI-2:17 | 28 | chr15: 89922495-89941719 | 3670 |
| lnc-FANCI-2:18 | 29 | chr15: 89923111-89941720 | 3784 |
| lnc-FANCI-2:19 | 30 | chr15: 89925731-89938271 | 779 |
| lnc-FANCI-2:2 | 31 | chr15: 89904810-89938551 | 1611 |
| lnc-FANCI-2:20 | 32 | chr15: 89929827-89939471 | 2718 |
| lnc-FANCI-2:21 | 33 | chr15: 89930671-89941720 | 3723 |
| lnc-FANCI-2:22 | 34 | chr15: 89904810-89941718 | 4778 |
| lnc-FANCI-2:23 | 35 | chr15: 89911330-89941718 | 4113 |
| lnc-FANCI-2:24 | 36 | chr15: 89911399-89941721 | 3936 |
| lnc-FANCI-2:25 | 37 | chr15: 89912393-89941683 | 4026 |
| lnc-FANCI-2:26 | 38 | chr15: 89921102-89941708 | 4334 |
| lnc-FANCI-2:27 | 39 | chr15: 89921273-89941718 | 3868 |
| lnc-FANCI-2:28 | 40 | chr15: 89922232-89941683 | 3978 |
| lnc-FANCI-2:29 | 41 | chr15: 89923021-89941683 | 3837 |
| lnc-FANCI-2:3 | 42 | chr15: 89905705-89922463 | 571 |
| lnc-FANCI-2:30 | 43 | chr15: 89929880-89941721 | 4915 |
| lnc-FANCI-2:31 | 44 | chr15: 89930027-89941721 | 4687 |
| lnc-FANCI-2:32 | 45 | chr15: 89930389-89931372 | 706 |
| lnc-FANCI-2:33 | 46 | chr15: 89930557-89941683 | 3922 |
| lnc-FANCI-2:34 | 47 | chr15: 89931724-89941721 | 3690 |
| lnc-FANCI-2:35 | 48 | chr15: 89932071-89941708 | 4093 |
| lnc-FANCI-2:4 | 49 | chr15: 89905718-89938562 | 957 |
| lnc-FANCI-2:5 | 50 | chr15: 89911330-89941718 | 2124 |
| lnc-FANCI-2:6 | 51 | chr15: 89912386-89931074 | 576 |
| lnc-FANCI-2:7 | 52 | chr15: 89918593-89941720 | 6547 |
| lnc-FANCI-2:8 | 53 | chr15: 89921220-89941692 | 3814 |
| lnc-FANCI-2:9 | 54 | chr15: 89921273-89941718 | 4198 |

TABLE 3 lnc-GLB1L2-1 isoforms

| Transcript ID | SEQ ID NO: | Location (hg19) | Length |
|---|---|---|---|
| lnc-GLB1L2-1:1 | 55 | chr11: 134306367-134337169 | 1402 bp |
| lnc-GLB1L2-1:10 | 56 | chr11: 134350719-134372941 | 295 bp |
| lnc-GLB1L2-1:11 | 57 | chr11: 134352524-134373110 | 374 bp |
| lnc-GLB1L2-1:12 | 58 | chr11: 134306376-134375555 | 2737 bp |
| lnc-GLB1L2-1:13 | 59 | chr11: 134339378-134360125 | 15706 bp |
| lnc-GLB1L2-1:14 | 60 | chr11: 134339400-134373384 | 744 bp |
| lnc-GLB1L2-1:15 | 61 | chr11: 134339400-134375553 | 1129 bp |
| lnc-GLB1L2-1:16 | 62 | chr11: 134343291-134373078 | 1843 bp |
| lnc-GLB1L2-1:17 | 63 | chr11: 134344051-134375009 | 1160 bp |
| lnc-GLB1L2-1:18 | 64 | chr11: 134346572-134375009 | 572 bp |
| lnc-GLB1L2-1:19 | 65 | chr11: 134349193-134375555 | 4435 bp |
| lnc-GLB1L2-1:2 | 66 | chr11: 134306469-134308558 | 374 bp |
| lnc-GLB1L2-1:20 | 67 | chr11: 134349983-134375009 | 1245 bp |
| lnc-GLB1L2-1:21 | 68 | chr11: 134350411-134401542 | 537 bp |
| lnc-GLB1L2-1:3 | 69 | chr11: 134306629-134374934 | 1863 bp |
| lnc-GLB1L2-1:4 | 70 | chr11: 134336079-134357809 | 3679 bp |
| lnc-GLB1L2-1:5 | 71 | chr11: 134336079-134357809 | 3620 bp |
| lnc-GLB1L2-1:6 | 72 | chr11: 134344060-134350796 | 720 bp |
| lnc-GLB1L2-1:7 | 73 | chr11: 134349193-134375507 | 4387 bp |
| lnc-GLB1L2-1:8 | 74 | chr11: 134349731-134352843 | 1398 bp |
| lnc-GLB1L2-1:9 | 75 | chr11: 134350086-134367700 | 939 bp |

In additional aspects, the biomarker includes FAM83A (SEQ ID NO: 86; KJ895067.1), SEMA3F (SEQ ID NOs: 87-89; NM_004186.4; NM_001318800.1; NM_001318798.1), CLDN10 (SEQ ID NOs: 90 and 91; NM_182848.3; NM_006984.4), ASRGL1 (SEQ ID NOs: 92 and 93; NM_001083926.1; NM_025080.3), or a combination thereof.

An RBP, lnc-RNA, or additional RNA biomarker is differentially expressed between two samples if the amount of the RBP, lnc-RNA, or additional RNA biomarker in one sample is statistically significantly different from the amount of the RBP, lnc-RNA, or additional RNA biomarker in the other sample. The expression level of an RBP, lnc-RNA, or additional RNA biomarker can be increased or decreased in a test sample relative to a reference sample. For example, an RBP gene, lnc-RNA, or additional RNA biomarker is differentially expressed in two samples if it is present at least about 120%, at least about 130%, at least about 150%, at least about 180%, at least about 200%, at least about 300%, at least about 500%, at least about 700%, at least about 900%, or at least about 1000% greater than it is present in the other sample, or if it is detectable in one sample and not detectable in the other.
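The percentage criterion above can be illustrated with a minimal Python sketch. It assumes the percentages are read as the ratio of the larger amount to the smaller amount (so that 120% corresponds to a 1.2-fold difference); that reading, the function name, and the default threshold are illustrative assumptions rather than definitions from this disclosure, and the statistical significance test that the text also calls for is not shown.

```python
def is_differentially_expressed(amount_a, amount_b, threshold_percent=120.0):
    """Return True if two biomarker amounts differ by at least threshold_percent.

    Assumption (not from the disclosure): the percentage is the ratio of the
    larger amount to the smaller amount, times 100.
    """
    if amount_a < 0 or amount_b < 0:
        raise ValueError("amounts must be non-negative")
    if amount_a == 0 and amount_b == 0:
        # Not detectable in either sample: no difference can be shown.
        return False
    if amount_a == 0 or amount_b == 0:
        # Detectable in one sample and not detectable in the other.
        return True
    ratio_percent = 100.0 * (max(amount_a, amount_b) / min(amount_a, amount_b))
    return ratio_percent >= threshold_percent
```

The comparison is symmetric, so the same call covers a biomarker that is increased or decreased in the test sample relative to the reference sample.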

Alternatively or additionally, an RBP gene, lnc-RNA, or additional RNA biomarker is differentially expressed in two sets of samples if the frequency of detecting the RBP gene, lnc-RNA, or additional RNA biomarker in samples is statistically significantly higher or lower than in the control samples. For example, an RBP gene, lnc-RNA, or additional RNA biomarker is differentially expressed in two sets of samples if it is observed at least about 120%, at least about 130%, at least about 150%, at least about 180%, at least about 200%, at least about 300%, at least about 500%, at least about 700%, at least about 900%, or at least about 1000% more frequently or less frequently in one set of samples than in the other set of samples.
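The frequency-based criterion can be sketched in the same way. The following Python example assumes each sample is reduced to a detected/not-detected flag and that the percentage is the ratio of the higher detection frequency to the lower one; the function names and that reading of the percentages are illustrative assumptions, and no significance test is included.

```python
import math

def detection_frequency(detected_flags):
    """Fraction of samples in which the biomarker was detected."""
    if not detected_flags:
        raise ValueError("at least one sample is required")
    return sum(1 for flag in detected_flags if flag) / len(detected_flags)

def frequency_ratio_percent(set_a_flags, set_b_flags):
    """Ratio of the higher to the lower detection frequency, as a percentage.

    Assumption (not from the disclosure): "120% more or less frequently"
    is read as a ratio of at least 1.2 between the two frequencies.
    """
    freq_a = detection_frequency(set_a_flags)
    freq_b = detection_frequency(set_b_flags)
    high, low = max(freq_a, freq_b), min(freq_a, freq_b)
    if high == 0:
        return 100.0      # never detected in either set: equal frequencies
    if low == 0:
        return math.inf   # detected in one set only
    return 100.0 * (high / low)
```

A result of at least 120.0 would meet the lowest of the listed thresholds under this reading.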

A test amount and a control amount of a biomarker can be either an absolute amount (e.g., number of copies/ml, nanogram/ml or microgram/ml) or a relative amount (e.g., relative intensity of signals).
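For real-time PCR data, one widely used way to express a relative amount is the comparative Ct (2^-ΔΔCt) calculation. The disclosure does not state which calculation is used, so the Python sketch below is an assumption offered only as an illustration; the reference (normalizer) transcript and all Ct values are hypothetical, and the formula presumes roughly 100% amplification efficiency.

```python
def relative_expression(ct_target_test, ct_ref_test,
                        ct_target_control, ct_ref_control):
    """Fold change of a target in a test sample relative to a control sample.

    Uses the comparative Ct (2^-ddCt) calculation. Each Ct of the target is
    first normalized to a reference transcript measured in the same sample.
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_test - delta_ct_control
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in
# the test sample than in the control, with identical reference Ct values.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

A fold change above 1 indicates a higher relative amount in the test sample, and a value below 1 indicates a lower relative amount.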

Diagnostic samples for use in the methods described herein comprise nucleic acids suitable for providing polynucleotide, e.g., RNA, expression information. The sample contains cells from a tissue of the test patient. For example, when the HPV-associated pre-cancer or HPV-associated cancer is anal cancer, the tissue of the test patient contains anal cells; when the HPV-associated pre-cancer or HPV-associated cancer is vulvovaginal cancer, the tissue of the test patient contains vulvovaginal cells; when the HPV-associated pre-cancer or HPV-associated cancer is penile cancer, the tissue of the test patient contains penile cells; or when the HPV-associated pre-cancer or HPV-associated cancer is oropharyngeal cancer, the tissue of the test patient contains oropharyngeal cells.

In one aspect, samples for the methods disclosed herein contain cells from a patient's cervix. Exemplary test samples include a PAP smear, a vaginal wash, or a cervical biopsy sample. In certain aspects, the methods described herein include obtaining from the test patient the sample containing cells from the test patient's cervix.

In certain aspects, the test patient is a patient at risk for an HPV-associated pre-cancer or an HPV-associated cancer, such as a patient diagnosed with HPV infection or a patient at high risk for HPV infection.

In certain aspects, the test patient is a patient at high risk for cervical cancer such as a woman at high risk for HPV infection, a woman with a diagnosed HPV infection, a woman with a history of DES exposure, a woman with a previous history of gynecological cancer, a woman with an abnormal PAP test, a woman immunosuppressed due to AIDS or therapy following organ transplantation, or a woman with abnormal endometrial cells.

In certain aspects, the methods disclosed herein comprise detecting the expression level of one or more biomarkers as disclosed herein.

In addition, the methods disclosed herein include the comparison/correlation of the expression levels of biomarkers in the diagnostic sample from the test patient to a reference sample. Exemplary reference samples include a control sample from a patient or patients with no evidence of HPV-associated pre-cancer or HPV-associated cancer, a control sample from a patient or patients with HPV-associated pre-cancer, and a control sample from a patient or patients with HPV-associated cancer. Additional exemplary reference samples include a control sample from a patient or patients with no evidence of cervical cancer, a control sample from a cervical cancer patient or patients, or a control sample from a patient or patients with stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia. The reference sample can be a single sample from a control patient with a known disease state, or preferably samples from a plurality of subjects such that the reference expression level is averaged over the expression levels for a population of known disease state. Useful population sizes for a reference population are greater than 100 subjects, specifically about 500 subjects for each reference group (CIN 1, 2, 3 and cervical cancer), for example.
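The averaging of reference expression levels over a population can be sketched as follows. This Python example computes a mean expression level for each reference group and reports which group mean lies closest to a test patient's level; the nearest-mean rule, the group labels, and the function names are illustrative assumptions and are not presented as the correlation method of this disclosure.

```python
from statistics import mean

def reference_levels(groups):
    """Average expression level for each reference group.

    `groups` maps a group label (e.g., "normal", "CIN", "cancer") to the
    expression levels measured in that group's control samples.
    """
    return {label: mean(levels) for label, levels in groups.items()}

def closest_reference_group(test_level, groups):
    """Label of the reference group whose mean is nearest the test level.

    Assumption: nearest-mean matching stands in for the correlation step;
    a real analysis would also account for variance and sample size.
    """
    averaged = reference_levels(groups)
    return min(averaged, key=lambda label: abs(averaged[label] - test_level))
```

With larger reference populations, such as the group sizes mentioned above, each group mean becomes a more stable estimate of the reference expression level.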

RNA can be extracted and purified from biological samples using suitable techniques that are known in the art, and several are commercially available (e.g., FormaPure® nucleic acid extraction kit, Agencourt® Biosciences, Beverly Mass.; High Pure FFPE RNA Micro Kit, Roche Applied Science, Indianapolis, Ind.). RNA can be extracted from frozen tissue sections using TRIzol® (Invitrogen, Carlsbad, Calif.) and purified using RNeasy® Protect kit (Qiagen, Valencia, Calif.). RNA can be further purified using DNase I treatment (Ambion, Austin, Tex.) to eliminate any contaminating DNA. RNA concentrations can be measured using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Rockland, Del.). RNA can be further purified to eliminate contaminants that interfere with cDNA synthesis by cold sodium acetate precipitation. RNA integrity can be evaluated by running electropherograms, and RNA integrity number (RIN, a correlative measure that indicates intactness of mRNA) can be determined using the RNA 6000 Pico Assay for the Bioanalyzer 2100 (Agilent Technologies, Santa Clara, Calif.).

Following sample collection and nucleic acid extraction, the nucleic acid portion of the sample comprising RNA that is or can be used to prepare the target polynucleotide(s) of interest can be subjected to one or more preparative reactions. These preparative reactions can include in vitro transcription (IVT), labeling, fragmentation, amplification, and other reactions. mRNA can first be treated with reverse transcriptase and a primer to create cDNA prior to detection, quantitation, or amplification; this can be done in vitro with purified mRNA or in situ, e.g., in cells or tissues affixed to a slide.

By “amplification” is meant a process of producing at least one copy of a nucleic acid, in this case an expressed RNA, and in many cases multiple copies. An amplification product can be RNA or DNA, and may include a complementary strand to the expressed target sequence. DNA amplification products can be produced initially through reverse transcription and then optionally from further amplification reactions. The amplification product may include all or a portion of a target sequence, and may optionally be labeled. A variety of amplification methods are suitable for use, including polymerase-based methods and ligation-based methods.

The expression level of a polynucleotide biomarker can be determined by reverse transcriptase-polymerase chain reaction (RT-PCR) methods, quantitative real-time RT-PCR (RT-qPCR), microarray, serial analysis of gene expression (SAGE), next-generation RNA sequencing (deep sequencing), gene expression analysis by massively parallel signature sequencing (MPSS), immunoassays such as ELISA, in situ hybridization (ISH) formulations that allow histopathological analysis, mass spectrometry (MS) methods, transcriptomics, RNA pull-down and chromatin isolation by RNA purification (ChiRP), proteomics-based identification of lncRNA, detection of single nucleotide polymorphisms (SNPs), measurement of DNA methylation or unmethylation, measurement of siRNA silencing or miRNA silencing, or measurement of downstream targets.

As used herein, the terms “quantitative real time polymerase chain reaction,” “real-time polymerase chain reaction,” and “qPCR” are synonymous and refer to a laboratory technique based on a polymerase chain reaction used to amplify and simultaneously quantify a targeted DNA molecule. Frequently, real-time PCR is combined with reverse transcription to quantify messenger RNA and non-coding RNA in cells or tissues, e.g., RT-qPCR.

Additional methods for detecting and/or quantifying a polynucleotide biomarker can comprise single-molecule sequencing (e.g., Illumina®, PacBio, ABI SOLID™), in situ hybridization, bead-array technologies (e.g., Luminex xMAP®, Illumina® BeadChips), branched DNA technology (e.g., Affymetrix®, Genisphere®), and Ion Torrent™. In some instances, methods for detecting and/or quantifying a target sequence comprise transcriptome sequencing techniques. Transcriptome sequencing (e.g., RNA-seq, “Whole Transcriptome Shotgun Sequencing” (WTSS)) may comprise the use of high-throughput sequencing technologies to sequence cDNA in order to get information about a sample's RNA content. Transcriptome sequencing can provide information on differential expression of genes, including gene alleles and differently spliced transcripts, non-coding RNAs, post-transcriptional mutations or editing, and gene fusions.

Included herein is a method for measuring the expression levels of biomarkers for HPV-associated pre-cancers and cancers as described herein. The methods optionally include identifying HPV-associated pre-cancer or cancer status of a test subject (e.g., cervical cancer). The data obtained from the expression profiles of a population (e.g., normal, CIN1-3, or cervical cancer) can be evaluated using one or more pattern recognition algorithms. In addition, the results of imaging tests or histological evaluation may optionally be combined with expression profiles generated using the genes disclosed herein.

In one aspect, the methods include

-   comparing (correlating) the expression level of the first polynucleotide biomarker in the sample containing cells from a tissue of the test patient to a reference expression level of the first polynucleotide biomarker in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of HPV-associated pre-cancer or HPV-associated cancer,
    -   a control sample from a patient or patients with HPV-associated pre-cancer, or
    -   a control sample from a patient or patients with HPV-associated cancer, and
-   determining, based on said correlation, if the test patient has HPV-associated pre-cancer or HPV-associated cancer.

In another aspect, the methods comprise

-   predicting (or determining), based on the expression level of one or more polynucleotide biomarkers in the sample containing cells from a tissue of the test patient and a reference expression level of the one or more polynucleotide biomarkers in a reference sample, that the patient has no HPV-associated pre-cancer or cancer, that the test patient has HPV-associated pre-cancer, or that the patient has HPV-associated cancer, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of HPV-associated pre-cancer or HPV-associated cancer,
    -   a control sample from a patient or patients with HPV-associated pre-cancer, or
    -   a control sample from a patient or patients with HPV-associated cancer.

In a further aspect, the methods include

-   classifying the patient as having no cervical cancer or cervical intraepithelial neoplasia, or as having HPV-associated pre-cancer or cancer based on the expression level of one or more polynucleotide biomarkers in the sample containing cells from a tissue of the test patient and a reference expression level of the one or more polynucleotide biomarkers in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of HPV-associated pre-cancer or HPV-associated cancer,
    -   a control sample from a patient or patients with HPV-associated pre-cancer, or
    -   a control sample from a patient or patients with HPV-associated cancer.

In one aspect, the methods include

-   comparing (or correlating) the expression level of one or more polynucleotide biomarkers in the sample containing cells from the test patient's cervix to a reference expression level of the one or more polynucleotide biomarkers in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of cervical cancer,
    -   a control sample from a cervical cancer patient or patients, or
    -   a control sample from a patient or patients with stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia, and
-   determining, based on said comparison, if the test patient has cervical cancer, or stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia.

In another aspect, the methods comprise

-   predicting (or determining), based on the expression level of one or more polynucleotide biomarkers in the sample containing cells from the test patient's cervix and a reference expression level of the one or more polynucleotide biomarkers in a reference sample, that the patient has no cervical cancer or cervical intraepithelial neoplasia, that the test patient has cervical cancer, or that the patient has stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of cervical cancer,
    -   a control sample from a cervical cancer patient or patients, or
    -   a control sample from a patient or patients with stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia.

In a further aspect, the methods include

-   classifying the patient as having no cervical cancer or cervical intraepithelial neoplasia, as having cervical cancer, or as having stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia based on the expression level of one or more polynucleotide biomarkers in the sample containing cells from the test patient's cervix and a reference expression level of the one or more polynucleotide biomarkers in a reference sample, wherein the reference sample is
    -   a control sample from a patient or patients with no evidence of cervical cancer,
    -   a control sample from a cervical cancer patient or patients, or
    -   a control sample from a patient or patients with stage 1, stage 2, or stage 3 cervical intraepithelial neoplasia.

Analysis methods may be used to form a predictive model, and then the predictive model may be used to classify test data. For example, one convenient and particularly effective method of classification employs multivariate statistical analysis modeling, first to form a model (a “predictive mathematical model”) using data (“modeling data”) from samples of known class (e.g., from subjects known to have, or not have, a particular grade of CIN or cervical cancer), and second to classify an unknown sample (e.g., “test data”), according to HPV-associated (e.g., cervical) cancer status.

Pattern recognition (PR) is the use of multivariate statistics, both parametric and non-parametric, to analyze spectroscopic data, and hence to classify samples and to predict the value of some dependent variable based on a range of observed measurements. There are two main approaches. One set of methods is termed “unsupervised” and these simply reduce data complexity in a rational way and also produce display plots which can be interpreted by the human eye. The other approach is termed “supervised” whereby a training set of samples with known class or outcome is used to produce a mathematical model and is then evaluated with independent validation data sets.

Unsupervised PR methods are used to analyze data without reference to any other independent knowledge. Examples of unsupervised pattern recognition methods include principal component analysis (PCA), hierarchical cluster analysis (HCA), and non-linear mapping (NLM).

Alternatively, and in order to develop automatic classification methods, it has proved efficient to use a “supervised” approach to data analysis. Here, a “training set” of biomarker expression data is used to construct a statistical model that predicts correctly the “class” of each sample. The resulting model is then tested with independent data (referred to as a test or validation set) to determine the robustness of the computer-based model. These models are sometimes termed “expert systems,” but may be based on a range of different mathematical procedures. Supervised methods can use a data set with reduced dimensionality (for example, the first few principal components), but typically use unreduced data, with all dimensionality. In all cases the methods allow the quantitative description of the multivariate boundaries that characterize and separate each class, for example, each class of cervical cancer in terms of its biomarker expression profile. It is also possible to obtain confidence limits on any predictions, for example, a level of probability to be placed on the goodness of fit. The robustness of the predictive models can also be checked using cross-validation, by leaving out selected samples from the analysis.
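
By way of a hedged illustration of the supervised approach and of leave-one-out cross-validation, the following Python sketch trains a simple nearest-centroid classifier on a training set of known class and checks its robustness by leaving out one sample at a time. The nearest-centroid rule is an assumption made for brevity and is only one of many possible mathematical procedures; the function names and all expression values are hypothetical.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train(data):
    """Build a model {class_label: centroid} from (features, label) pairs."""
    by_label = {}
    for features, label in data:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Assign the class whose centroid is nearest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))

def leave_one_out_accuracy(data):
    """Cross-validate by holding out each sample once and predicting it."""
    hits = 0
    for i, (features, label) in enumerate(data):
        model = train(data[:i] + data[i + 1:])
        hits += predict(model, features) == label
    return hits / len(data)

# Hypothetical two-biomarker expression profiles (arbitrary units).
training_set = [
    ([1.0, 4.0], "normal"), ([1.2, 3.8], "normal"), ([0.9, 4.2], "normal"),
    ([5.0, 1.0], "cancer"), ([5.5, 0.8], "cancer"), ([4.8, 1.2], "cancer"),
]
model = train(training_set)
test_call = predict(model, [5.2, 0.9])
loo_accuracy = leave_one_out_accuracy(training_set)
```

An independent validation set would be classified with `predict` in the same way as the single test profile shown here.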

It is often useful to pre-process data, for example, by addressing missing data, translation, scaling, weighting, etc. Multivariate projection methods, such as principal component analysis (PCA) and partial least squares analysis (PLS), are so-called scaling sensitive methods. By using prior knowledge and experience about the type of data studied, the quality of the data prior to multivariate modeling can be enhanced by scaling and/or weighting. Adequate scaling and/or weighting can reveal important and interesting variation hidden within the data, and therefore make subsequent multivariate modeling more efficient. Scaling and weighting may be used to place the data in the correct metric, based on knowledge and experience of the studied system, and therefore reveal patterns already inherently present in the data.
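
One common form of scaling, shown here only as an assumed example, is autoscaling: each variable is mean-centered and divided by its standard deviation so that biomarkers measured on very different numeric ranges contribute comparably to a projection method such as PCA. The function name `autoscale` and the input values below are hypothetical.

```python
from statistics import mean, stdev

def autoscale(matrix):
    """Mean-center each column and scale it to unit sample variance.

    matrix is a list of rows (samples); each row lists one value per variable.
    Columns with zero variance are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        [(value - m) / s if s else 0.0 for value, m, s in zip(row, means, sds)]
        for row in matrix
    ]

# Hypothetical data: the second variable has a much larger numeric range.
raw = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
scaled = autoscale(raw)
```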

The methods described herein may be implemented and/or the results recorded using a device capable of implementing the methods and/or recording the results. Examples of devices that may be used include but are not limited to electronic computational devices, including computers of all types. When the methods described herein are implemented and/or recorded in a computer, the computer program that may be used to configure the computer to carry out the steps of the methods may be contained in any computer readable medium capable of containing the computer program. Examples of computer readable medium that may be used include but are not limited to diskettes, CD-ROMs, DVDs, ROM, RAM, and other memory and computer storage devices. The computer program that may be used to configure the computer to carry out the steps of the methods and/or record the results may also be provided over an electronic network, for example, over the internet, an intranet, or other network.

The process of comparing a measured value and a reference value can be carried out in a convenient manner appropriate to the type of measured value and reference value for the discriminative gene at issue. “Measuring” can be performed using quantitative or qualitative measurement techniques, and the mode of comparing a measured value and a reference value can vary depending on the measurement technology employed. For example, when a qualitative colorimetric assay is used to measure expression levels, the levels may be compared by visually comparing the intensity of the colored reaction product, or by comparing data from densitometric or spectrometric measurements of the colored reaction product (e.g., comparing numerical data or graphical data, such as bar charts, derived from the measuring device). However, it is expected that the measured values used in the methods will most commonly be quantitative values. In other examples, measured values are qualitative. As with qualitative measurements, the comparison can be made by inspecting the numerical data, or by inspecting representations of the data (e.g., inspecting graphical representations such as bar or line graphs).

The process of comparing may be manual (such as visual inspection by the practitioner of the method) or it may be automated. For example, an assay device (such as a luminometer for measuring chemiluminescent signals) may include circuitry and software enabling it to compare a measured value with a reference value for a biomarker. Alternately, a separate device (e.g., a digital computer) may be used to compare the measured value(s) and the reference value(s). Automated devices for comparison may include stored reference values for the biomarker(s) being measured, or they may compare the measured value(s) with reference values that are derived from contemporaneously measured reference samples (e.g., samples from control subjects).

As will be apparent to those of skill in the art, when replicate measurements are taken, the measured value that is compared with the reference value is a value that takes into account the replicate measurements. The replicate measurements may be taken into account by using either the mean or median of the measured values as the “measured value.”
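
The handling of replicates can be sketched as follows. This is an illustrative example only: the function names and the replicate and reference values are hypothetical, and expressing the comparison as a ratio to the reference value is an assumption rather than a requirement of the methods.

```python
from statistics import mean, median

def measured_value(replicates, method="median"):
    """Collapse replicate measurements into the single value to be compared."""
    if not replicates:
        raise ValueError("at least one replicate is required")
    if method == "mean":
        return mean(replicates)
    if method == "median":
        return median(replicates)
    raise ValueError("method must be 'mean' or 'median'")

def ratio_to_reference(replicates, reference_value, method="median"):
    """Ratio of the summarized measurement to a reference value."""
    return measured_value(replicates, method) / reference_value

# Hypothetical triplicate with one outlying replicate.
replicates = [2.0, 2.2, 9.0]
by_median = measured_value(replicates)
by_mean = measured_value(replicates, method="mean")
```

With an outlying replicate such as the one in this made-up triplicate, the median is less affected than the mean, which is one reason a practitioner might prefer it.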

When it has been determined that the test patient has HPV-associated pre-cancer or cancer, the methods optionally include HPV detection and/or typing.

When it has been determined that the test patient has CIN 1, 2, or 3 or cervical cancer, the methods optionally include HPV detection and/or typing, for example, using the cobas® HPV test marketed by Roche Diagnostics.

Also included herein are methods of treating the test patient with an interventional strategy for HPV-associated pre-cancer or cancer.

Interventional therapies for anal, vulvovaginal, penile, and oropharyngeal cancer include radiation therapy, surgery, and chemotherapy.

Further included herein are methods of treating the test patient with an interventional strategy for CIN or cervical cancer. When the patient is determined to have stage 1 CIN, the interventional strategy may include screening for further cervical changes, screening the patient for HPV infection, HPV typing, or a combination thereof. Exemplary tests for the detection of HPV infection include detection of HPV infection via DNA/RNA amplification with PCR using, for example, the cobas® HPV test marketed by Roche Diagnostics. Advantageously, early identification of CIN 1 optionally coupled with determining the HPV infection type will provide critical information regarding the type of intervention required to treat the patient. Early diagnosis and treatment at stage CIN 1 could prevent or slow progression to later disease stages.

When the patient is determined to have stage 2 or stage 3 CIN, interventional strategies may include, in addition to monitoring, cryosurgery to freeze abnormal cells, laser therapy to remove abnormal tissue, loop electrosurgical excision procedure, surgery to remove abnormal tissue, or hysterectomy. At early stages, for example, low cost outpatient procedures such as loop electrosurgical excision are 90-95% effective. Thus, a benefit to the methods disclosed herein is the ability to use minor surgical intervention before CIN progresses to cervical cancer.

Interventional strategies for the treatment of cervical cancer include surgery, radiation therapy, chemotherapy, targeted therapy, or a combination thereof. Surgery involves removal of the cancer and may include conization to remove tissue from the cervix and/or cervical canal or hysterectomy such as total, radical, or modified radical hysterectomy. Radiation therapy includes internal and external radiation therapy in addition to intensity-modulated radiation therapy. Chemotherapy involves the use of drugs to inhibit the growth of cancer cells and can involve systemic or regional chemotherapy. Drugs approved for the treatment of cervical cancer include bleomycin, cisplatin, topotecan hydrochloride, and gemcitabine-cisplatin. Targeted therapy involves the use of drugs that identify and attack specific cancer cells without harming normal cells. Targeted therapy includes antibody therapy such as bevacizumab therapy.

Further disclosed herein is a probe set for diagnosing, predicting, and/or monitoring cervical cancer in a subject. The probe set comprises a plurality of polynucleotide probes capable of detecting an expression level of at least one biomarker for CIN or cervical cancer, wherein the expression level determines the CIN or cervical cancer status of the subject.

In one aspect, a probe set comprises

-   one or more polynucleotides that hybridize to a first polynucleotide biomarker, wherein the first polynucleotide biomarker is GRB7 (SEQ ID NOs: 8-11), NOVA1 (SEQ ID NOs: 14 and 15), RNASEH2A (SEQ ID NO: 19), or a combination thereof, and
-   one or more polynucleotides that hybridize to a second polynucleotide biomarker, wherein the second polynucleotide biomarker is lnc-FANCI-2, lnc-GLB1L2-1, or a combination thereof.

In certain aspects, the probe set is attached to a solid support, and/or each member of the probe set comprises a detectable moiety.

One skilled in the art understands that the nucleotide sequence of the polynucleotide probe need not be identical to its target sequence in order to specifically hybridize thereto. The polynucleotide probes, therefore, comprise a nucleotide sequence that is at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more identical to a region of the coding target or non-coding target. Methods of determining sequence identity are known in the art and can be determined, for example, by using the BLASTN program of the University of Wisconsin Computer Group (GCG) software or provided on the NCBI website. The nucleotide sequence of the polynucleotide probes may exhibit variability by differing (e.g. by nucleotide substitution, including transition or transversion) at one, two, three, four or more nucleotides from the sequence of the coding target or non-coding target.
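
To make the notion of percent identity concrete, the following Python sketch counts matching positions between a probe and an equal-length, already aligned region of its target. This is a deliberately simplified, ungapped calculation and is not a substitute for an alignment program such as BLASTN; the function name and both sequences are made up for illustration and do not correspond to any SEQ ID NO herein.

```python
def percent_identity(probe, target_region):
    """Percent of positions at which two aligned, equal-length sequences match.

    Substitutions (transitions or transversions) count as mismatches.
    Gaps are not handled in this simplified sketch.
    """
    if not probe or len(probe) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(probe.upper(), target_region.upper()))
    return 100.0 * matches / len(probe)

# Made-up 10-nucleotide sequences differing by one substitution.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```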

Primers/probes based on the nucleotide sequences of target sequences can be used in amplification of the target sequences. For use in amplification reactions such as PCR, a pair of primers can be used. The exact composition of the primer sequences is selected so that the primers hybridize to specific sequences of the probe set under stringent conditions, particularly under conditions of high stringency. The pairs of primers are usually chosen so as to generate an amplification product of at least about 50 nucleotides, more usually at least about 100 nucleotides. Algorithms for the selection of primer sequences are generally known, and are available in commercial software packages. These primers may be used in standard quantitative or qualitative PCR-based assays to assess transcript expression levels of RNAs defined by the probe set. Alternatively, these primers may be used in combination with probes, such as molecular beacons in amplifications using real-time PCR.

The polynucleotide probes or primers can incorporate moieties useful in detection, isolation, purification, or immobilization, if desired. Such moieties are detectable labels, such as radioisotopes, fluorophores, chemiluminophores, enzymes, colloidal particles, and fluorescent microparticles, as well as antigens, antibodies, haptens, avidin/streptavidin, biotin, enzyme cofactors/substrates, and the like. A label can optionally be attached to or incorporated into a probe or primer polynucleotide to allow detection and/or quantitation of a target polynucleotide representing the target sequence of interest.

In some embodiments, one or more polynucleotide probes/primers provided herein can be provided on a substrate. The substrate can comprise a wide range of material, either biological, nonbiological, organic, inorganic, or a combination of any of these. For example, the substrate may be a polymerized Langmuir Blodgett film, functionalized glass, Si, Ge, GaAs, GaP, SiO₂, SiN₄, modified silicon, or any one of a wide variety of gels or polymers such as (poly)tetrafluoroethylene, (poly)vinylidenedifluoride, polystyrene, cross-linked polystyrene, polyacrylic, polylactic acid, polyglycolic acid, poly(lactide coglycolide), polyanhydrides, poly(methyl methacrylate), poly(ethylene-co-vinyl acetate), polysiloxanes, polymeric silica, latexes, dextran polymers, epoxies, polycarbonates, or combinations thereof. Conducting polymers and photoconductive materials can be used.

Substrates can be planar crystalline substrates such as silica based substrates (e.g., glass, quartz, or the like), or crystalline substrates used in, e.g., the semiconductor and microprocessor industries, such as silicon, gallium arsenide, indium doped GaN and the like, and include semiconductor nanocrystals.

The substrate can take the form of an array, a photodiode, an optoelectronic sensor such as an optoelectronic semiconductor chip or optoelectronic thin-film semiconductor, or a biochip. The location(s) of probe(s) on the substrate can be addressable; this can be done in highly dense formats, and the location(s) can be microaddressable or nanoaddressable.

The substrate can be a plate, slide, bead, pellet, disk, particle, microparticle, nanoparticle, strand, precipitate, optionally porous gel, sheets, tube, sphere, capillary, film, chip, multiwell plate or dish, optical fiber, etc. The substrate can be a form that is rigid or semi-rigid. The substrate may contain raised or depressed regions on which an assay component is located. The surface of the substrate can be etched using known techniques to provide for desired surface features, for example trenches, v-grooves, mesa structures, or the like.

Surfaces on the substrate can be composed of the same material as the substrate or can be made from a different material, and can be coupled to the substrate by chemical or physical means. Such coupled surfaces may be composed of any of a wide variety of materials, for example, polymers, plastics, resins, polysaccharides, silica or silica-based materials, carbon, metals, inorganic glasses, membranes, or any of the above-listed substrate materials. The surface can be optically transparent and can have surface Si—OH functionalities, such as those found on silica surfaces.

The substrate and/or its optional surface can be chosen to provide appropriate characteristics for the synthetic and/or detection methods used. The substrate and/or surface can be transparent to allow the exposure of the substrate by light applied from multiple directions. The substrate and/or its surface is generally resistant to, or is treated to resist, the conditions to which it is to be exposed in use, and can be optionally treated to remove any resistant material after exposure to such conditions.

The substrate or a region thereof may be encoded so that the identity of the sensor located in the substrate or region being queried may be determined. A suitable coding scheme can be used, for example optical codes, RFID tags, magnetic codes, physical codes, fluorescent codes, and combinations of codes.

The invention is further illustrated by the following non-limiting examples.

EXAMPLES

Materials and Methods

Human patient samples: Samples for RNA sequencing, containing 7 normal cervical tissues, 7 pre-cancer tissues and 7 cervical cancer tissues, and samples for validation, including 24 normal cervical tissues, 25 CIN 2-3 tissues, and 23 cervical cancer tissues, were all collected from the Women's Hospital, School of Medicine, Zhejiang University. All the human samples were used in accordance with the Institutional Review Board procedures of the hospital. Informed consent was obtained from each participant prior to the study. Samples were snap-frozen and stored at −80° C. until use.

RNA isolation: RNA was isolated from each human tissue sample by TRIzol® (Invitrogen, CA, USA) according to the instructions provided by the manufacturer. Total RNA quality and quantity were verified spectrophotometrically (NanoDrop ND-1000 spectrophotometer; Thermo Scientific, DE, USA) and electrophoretically (Bioanalyzer 2100; Agilent Technologies, CA, USA).

RNA sequencing and mapping: RNA-seq libraries were prepared using the TruSeq® Stranded Total RNA Sample Preparation Kit with Ribo-Zero™ depletion and sequenced on an Illumina® HiSeq™-2500 platform as paired-end reads. In brief, high-quality human total RNA (1 μg) was Ribo-Zero™ depleted, fragmented, and then reverse transcribed. The double-stranded cDNA was A-tailed and ligated with Illumina® sequencing adapters. Subsequently, the ligated products were enriched by PCR and size-selected by agarose gel electrophoresis. The products of approximately 200-400 bp in size were sequenced on the Illumina® HiSeq™-2500 platform. The raw data in fastq format were mapped to the human reference genome (hg19, GRCh37) by TopHat v2.0.11 (-g 1), which used the aligner Bowtie (v2.2.1.0) with the parameter settings (-N 0, -L 20, -i S,1,1.25, --n-ceil L,0,0.15 and --gbar 4). The mapping results were further sorted by coordinate position using samtools (v0.1.19.0) (Robinson MD, Oshlack A., “A scaling normalization method for differential expression analysis of RNA-seq data,” Genome Biology, 11:R25 (2010); Robinson MD, McCarthy DJ and Smyth GK., “edgeR: a Bioconductor package for differential expression analysis of digital gene expression data,” Bioinformatics, 26, pp. 139-140 (2010)). The latest annotation of lncRNA was downloaded from the publicly available LNCipedia database version 3.0. The mapped reads in the individual lncRNA regions of each sample were counted by bedtools (v2.19.0). The R Bioconductor edgeR package was used to normalize raw reads by the scaling method. Differentially expressed lncRNAs were identified by the one-way ANOVA method with a 10% false discovery rate (FDR) and four-fold changes between the conditions. The FDR was controlled by the Benjamini-Hochberg (BH) procedure. RNA-binding protein genes were compiled from the literature (Alfredo Castello, et al., “Insights into RNA Biology from an Atlas of Mammalian mRNA-Binding Proteins,” Cell, 149, pp. 1393-1406 (2012); Alfredo Castello, et al., “RNA-binding proteins in Mendelian disease,” Trends in Genetics, 29, pp. 318-327 (2013)). The normalized reads from the multiple transcripts of each gene were averaged to represent composite gene expression. The expression results were clustered using unsupervised hierarchical clustering analysis, in which the Euclidean distance was used as the similarity measure.
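
The selection criteria described above (Benjamini-Hochberg FDR of 10% combined with a four-fold change) can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the pipeline actually used: the p-values are assumed to come from a separate one-way ANOVA step, the four-fold criterion is assumed to apply in either direction, and the transcript names, p-values, and fold changes are hypothetical.

```python
import math

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_differential(names, pvalues, fold_changes, fdr=0.10, min_fold=4.0):
    """Keep transcripts passing both the FDR and the fold-change thresholds."""
    adjusted = benjamini_hochberg(pvalues)
    cutoff = math.log2(min_fold)
    return [
        name
        for name, q, fc in zip(names, adjusted, fold_changes)
        if q <= fdr and abs(math.log2(fc)) >= cutoff
    ]

# Hypothetical transcripts, ANOVA p-values, and cancer/normal fold changes.
names = ["lnc-A", "lnc-B", "lnc-C", "lnc-D"]
pvalues = [0.001, 0.04, 0.03, 0.5]
fold_changes = [8.0, 0.2, 1.5, 5.0]
selected = select_differential(names, pvalues, fold_changes)
```

Here `lnc-C` fails the fold-change criterion and `lnc-D` fails the FDR criterion, so only the first two made-up transcripts are retained.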

Human primary keratinocytes and organotypic (raft) epithelial cultures: Total RNA samples extracted from various raft tissues were left over from previous studies (Wang, X. et al., “Oncogenic HPV infection interrupts the expression of tumor-suppressive miR-34a through viral oncoprotein E6,” RNA, 15, pp. 637-647 (2009); Wang, X., et al., “microRNAs are biomarkers of oncogenic human papillomavirus infections,” Proc. Natl. Acad. Sci. USA, 111, pp. 4262-4267 (2014)). Briefly, primary human foreskin keratinocytes (HFK) and primary human vaginal keratinocytes (HVK) were isolated from newborn circumcision and adult vaginectomy tissue specimens, respectively, as previously described (Meyers, C., Mayer, T. J., and Ozbun, M. A., “Synthesis of infectious human papillomavirus type 18 in differentiating epithelium transfected with viral DNA,” J. Virol., 71, pp. 7381-7386 (1997)). Keratinocytes were grown in monolayer culture by using epithelial (E) medium plus epidermal growth factor (5 ng/ml) in the presence of mitomycin C (4 μg/ml)-treated J2 3T3 feeder cells. Keratinocyte lines stably maintaining HPV16 and HPV18 DNA following electroporation were subcloned by limiting dilutions of cells. Organotypic (raft) epithelial culture tissues derived from HPV16 and HPV18-immortalized HFK or HVK were prepared as described previously (McLaughlin-Drubin, M. E. and Meyers, C., “Propagation of infectious, high-risk HPV in organotypic “raft” culture,” Methods Mol. Med., 119, pp. 171-186 (2005)). The stratified and differentiated raft culture epidermal tissues were collected free from collagen (no fibroblasts) on day 10 and frozen on dry ice for total cell RNA preparation. Additional productive HPV18 raft cultures of HFKs were obtained by Cre-loxP-mediated recombination as described (Wang, H. K., Duffy, A. A., Broker, T. R., and Chow, L. T., “Robust production and passaging of infectious HPV in squamous epithelium of primary human keratinocytes,” Genes Dev., 23, pp. 181-194 (2009)), and the derived raft cultures were collected on day 8, day 12, and day 16.

Plasmids pLJd-HPV-18URR-E6, pLC-HPV-18URR-E7, and pLJd-HPV-18URR-E6E7 have been described (Cheng, S., Schmidt-Grimminger, D. C., Murant, T., Broker, T. R., and Chow, L. T., “Differentiation-dependent up-regulation of the human papillomavirus E7 gene reactivates cellular DNA replication in suprabasal differentiated keratinocytes,” Genes Dev., 9, pp. 2335-2349 (1995); Genovese, N. J., Banerjee, N. S., Broker, T. R., and Chow, L. T., “Casein kinase II motif-dependent phosphorylation of human papillomavirus E7 protein promotes p130 degradation and S-phase induction in differentiated human keratinocytes,” J. Virol., 82, pp. 4862-4873 (2008)). Retroviruses derived from the above vectors were prepared as described (Banerjee, N. S., Chow, L. T., and Broker, T. R., “Retrovirus-mediated gene transfer to analyze HPV gene regulation and protein functions in organotypic “raft” cultures,” Methods Mol. Med., 119, pp. 187-202 (2005)). Primary HFKs were acutely infected with the retroviruses and selected with G-418 (300 μg/mL). The selected HFKs were used to establish epithelial raft cultures and harvested on day 11.

TaqMan® real-time quantitative PCR assays: Expression of genes in clinical samples and raft tissues was quantitatively validated by real-time PCR using TaqMan® gene expression assays (Applied Biosystems). In brief, 2 μg of total RNA from each sample was reverse transcribed using the Superscript® First-Strand Synthesis kit (Invitrogen) according to the manufacturer's instructions. TaqMan® gene expression assays for RNA-binding protein gene expression were obtained from Life Technologies, and lncRNA primers for RT-qPCR were designed as given in Example 2.

The TaqMan® assay probes that span exon-exon junctions were designed to amplify spliced RNA products and thereby avoid detection of any contaminating residual genomic DNA in our RNA samples. After reverse transcription, PCR products were amplified from the cDNA samples using TaqMan® gene expression Master Mix (Applied Biosystems) together with TaqMan® gene expression assays on a StepOne Plus™ Real-Time PCR system (Applied Biosystems). Gene enrichment was calculated using the 2^(−ΔΔCt) method in relation to the housekeeping gene GAPDH. The mean Ct value of a given gene from 24 normal cervical tissues after normalization served as the basal level for calculating the relative level of the gene detected in each clinical sample. Data are presented as a bar graph with mean ± SE for each group. Significance of mRNA levels among clinical tissue groups was analyzed using the nonparametric Mann-Whitney U-test, while significance of the mRNA levels between raft culture tissue groups was analyzed by Student's t-test.
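The 2^(−ΔΔCt) calculation described above can be sketched as follows. This is a minimal illustration, not the inventors' analysis script: the function name, argument names, and the Ct values in the usage lines are hypothetical, and the sketch assumes a single Ct value per gene per sample with GAPDH as the reference gene.

```python
def relative_expression(ct_target_sample, ct_gapdh_sample,
                        ct_target_control, ct_gapdh_control):
    """Return relative expression by the 2^(-ddCt) method.

    All argument names are illustrative. Each Ct is first normalized to
    GAPDH (dCt), then the control dCt is subtracted (ddCt).
    """
    d_ct_sample = ct_target_sample - ct_gapdh_sample      # dCt of the test sample
    d_ct_control = ct_target_control - ct_gapdh_control   # dCt of the normal baseline
    dd_ct = d_ct_sample - d_ct_control                    # ddCt
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in the
# test sample than in the normal baseline, with identical GAPDH Ct values.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0
```

A ΔΔCt of −2 corresponds to a four-fold higher level than the normal baseline, because each PCR cycle is assumed to double the product.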

Example 1: Identification of Altered Expression of RNA-Binding Protein Genes in Cervical Cancer

Using an RNA-sequencing (RNA-Seq) approach, seven normal cervical tissues and seven cervical cancer tissues were examined for their expression landscapes of approximately 19,000 coding and 113,513 noncoding RNAs. We identified 614 differentially expressed coding transcripts enriched in cancer-related pathways, 95 of which encode RNA-binding proteins (RBPs) among the 1502 human RBPs analyzed. Moreover, we identified 34 abundantly and differentially expressed lnc-RNAs between normal cervix and cervical cancer. Table 4 shows the two RNA-Seq analyses of 14 different clinical cervical tissues with two different RNA-Seq platforms, each containing normal cervical tissues without HPV infection and cervical cancer tissues with HPV infection. The right column of the table shows the raw reads of individual samples from each RNA-Seq platform.

TABLE 4
RNA-Seq detection from 14 cervical tissue samples

Platform    Sample No.  Age (yr)  Pathology  HPV infection  Total reads
RNA-Seq-1   1           27        N          No             13,171,863
RNA-Seq-1   2           38        N          No             12,028,762
RNA-Seq-1   3           42        N          No             31,143,321
RNA-Seq-1   4           40        SCC        Yes            12,422,476
RNA-Seq-1   5           42        SCC        Yes            11,425,454
RNA-Seq-1   6           24        SCC        Yes            22,302,605
RNA-Seq-2   7           42        N          No             85,255,279
RNA-Seq-2   8           37        N          No             83,376,820
RNA-Seq-2   9           52        N          No             80,265,055
RNA-Seq-2   10          44        N          No             81,954,460
RNA-Seq-2   11          48        SCC        Yes            66,982,821
RNA-Seq-2   12          45        SCC        Yes            74,819,347
RNA-Seq-2   13          47        SCC        Yes            93,579,886
RNA-Seq-2   14          49        SCC        Yes            66,891,722

FIG. 1 is a flowchart of the RNA-Seq analyses. FIG. 2 shows Venn diagrams and FIG. 3 shows a heat map of the 95 differentially expressed RNA-binding protein genes in cervical cancer (n=7) compared to normal cervical tissues (n=7). Table 5 summarizes the 8 RBPs with expression changes between normal and cancer tissues by RNA-Seq (CPM: counts per million).

TABLE 5
RNA-Seq data of the 8 RBP genes between normal and cancer tissues

RNA-binding                                                      Normal (log₂ CPM,  Cancer (log₂ CPM,
protein genes  Description                                       mean ± SD)         mean ± SD)
CDKN2A         Cyclin-dependent kinase inhibitor 2A              −0.24 ± 0.88       6.3 ± 1.12
ELAVL2         ELAV like neuron-specific RNA binding protein 2   −3.38 ± 1.89       0.17 ± 3.54
GRB7           Growth factor receptor-bound protein 7            0.9 ± 0.96         4.07 ± 1.22
HSPB1          Heat shock 27 kDa protein 1                       5.74 ± 1.09        8.84 ± 2.49
KHSRP          KH-type splicing regulatory protein               4.35 ± 0.18        5.85 ± 0.78
NOVA1          Neuro-oncological ventral antigen 1               2.82 ± 0.55        0.1 ± 1.55
PTBP1          Polypyrimidine tract binding protein 1            5.74 ± 0.21        7.18 ± 0.83
RNASEH2A       Ribonuclease H2, subunit A                        2.32 ± 0.47        5.01 ± 0.72

Table 6 provides the TaqMan® probe information of each RBP.

TABLE 6
TaqMan® probe information of each RBP

Company              Order name                              Cat No           ID No
Applied Biosystems®  Single Tube TaqMan® Assay for GRB7      Cat. # 4331182   Hs00918009_g1
Applied Biosystems®  Single Tube TaqMan® Assay for ELAVL2    Cat. # 4331182   Hs00270011_m1
Applied Biosystems®  Single Tube TaqMan® Assay for RNASEH2A  Cat. # 4331182   Hs00958451_g1
Applied Biosystems®  Single Tube TaqMan® Assay for KHSRP     Cat. # 4351372   Hs01100863_g1
Applied Biosystems®  Single Tube TaqMan® Assay for NOVA1     Cat. # 4351372   Hs01103130_m1
Applied Biosystems®  Single Tube TaqMan® Assay for PTBP1     Cat. # 4351372   Hs00914687_g1
Applied Biosystems®  Single Tube TaqMan® Assay for CDKN2A    Cat. # 4331182   Hs00923894_m1
Applied Biosystems®  Single Tube TaqMan® Assay for HSPB1     Cat. # 4331182   Hs03044127_g1

FIG. 4 shows the TaqMan® RT-qPCR validation confirming that all 8 RBPs were significantly increased (7 RBPs) or decreased (1 RBP) in cervical cancer tissues (n=23) compared to normal cervical tissues (n=24). The 7 RBP genes increased in cervical cancer also showed higher expression in pre-cancerous lesions (CIN 2-3, n=25) when compared to the normal tissues, indicating that these changes appear even at the early stage of cervical carcinogenesis. **, P<0.01; ***, P<0.001; NS, not statistically significant.

FIGS. 5 and 6 show that high-risk HPV16 and HPV18 infection affects the expression of RBPs. FIG. 5 shows total RNA extracted from human vaginal keratinocyte (HVK)-derived raft cultures with (HVK16) or without (HVK) productive HPV16 infection and from human foreskin keratinocyte (HFK)-derived raft cultures with (HFK16) or without (HFK) productive HPV16 infection, examined by TaqMan® RT-qPCR for the expression of 8 RBPs. *, P<0.05; **, P<0.01; ***, P<0.001; NS, not statistically significant. FIG. 6 shows total RNA extracted from human vaginal keratinocyte (HVK)-derived raft cultures with (HVK18) or without (HVK) productive HPV18 infection and from human foreskin keratinocyte (HFK)-derived raft cultures with (HFK18) or without (HFK) productive HPV18 infection, examined by TaqMan® RT-qPCR for the expression of 8 RBPs. *, P<0.05; ***, P<0.001; NS, not statistically significant. FIG. 7 shows that both HPV16 and HPV18 increase the expression of CDKN2A and RNASEH2A, but decrease the expression of NOVA1, in HFK- and HVK-derived rafts. In this experiment, total RNA was used to determine the relative expression levels of the individual RBP genes by TaqMan® RT-qPCR. FIG. 8 shows that HPV18 infection and viral E6 and/or E7 affect the expression of RNASEH2A and NOVA1. The expression of RNASEH2A and NOVA1 in primary human keratinocyte (PHK)-derived raft tissues with or without HPV18 infection on day 8, day 12, and day 16, or in PHK rafts transduced with a retrovirus expressing HPV18 E6, E7 or E6E7 or with an empty control retrovirus, was further validated by TaqMan® RT-qPCR. These results demonstrate that RNASEH2A and NOVA1 respond to HPV18 infection and that their altered expression in cervical cancer could be attributed to viral oncoprotein E6 and/or E7. *, P<0.05; ***, P<0.001; NS, not statistically significant.

FIG. 9 shows that knockdown or overexpression of RNASEH2A in HeLa or CaSki cells affects cell proliferation. The effect of specific siRNA knockdown, or of ectopic expression of RNASEH2A from a mammalian expression vector, on proliferation of HeLa or CaSki cells was evaluated by Cell Counting Kit-8 (CCK-8) assay at the times indicated. si-NS, non-specific siRNA; siRNASEH2A, RNASEH2A-specific siRNA; P, control vector; p-RNASEH2A, RNASEH2A-expression vector. FIG. 10 shows that HPV oncoprotein E7 regulates the expression of RNASEH2A via E2F1. The effect of specific siRNA knockdown, or of ectopic expression of E2F1 from a mammalian expression vector, on RNASEH2A in HeLa or CaSki cells was evaluated by Western blot using anti-RNASEH2A antibody. si-NS, non-specific siRNA; si-E2F1, E2F1-specific siRNA; P, control vector; p-E2F1, E2F1-expression vector.

Example 2: The Expression Profile of Long Noncoding RNAs Distinguishes Normal Cervix From Cancerous Cervix

RNA was extracted from each sample using Trizol® reagent (Life Technologies). RNA-Seq libraries were prepared using the TruSeq® Stranded Total RNA Kit with Ribo-Zero depletion and sequenced on an Illumina HiSeq™ 2000 platform as paired-end reads. The fastq data were mapped to the human reference genome (hg19, GRCh37) by Bowtie (v2.2.1.0), and the mapping results were further filtered by samtools (v0.1.19.0). The latest annotation of lncRNAs was downloaded from the LNCipedia database, version 3.0. We counted the mapped reads in the individual lncRNA regions of each sample by bedtools (v2.19.0). The R Bioconductor edgeR package was used to normalize raw reads by the scaling method. The differentially expressed lncRNAs were detected by the one-way ANOVA method with a 10% false discovery rate (FDR) and a four-fold change between the conditions. FIG. 11 is a flow chart of the RNA-Seq analysis. FIG. 12 is a heat map showing 34 overlapped, differentially expressed lnc-RNAs in cervical cancer compared to normal cervical tissues. lnc-FANCI-2 and lnc-GLB1L2-1 were specifically identified as associated with cervical cancer. Tables 2 and 3 list all of the isoforms of these two lnc-RNAs.
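The counts-per-million scaling and the four-fold change criterion mentioned above can be illustrated with a short sketch. It is a simplification under stated assumptions: it uses plain library-size scaling rather than edgeR's normalization, it omits the ANOVA and FDR steps entirely, and the function names, the pseudocount, and the read counts are all hypothetical.

```python
import math


def counts_per_million(count, library_size):
    """Scale a raw read count by the sample's total mapped reads."""
    return count / library_size * 1_000_000


def log2_fold_change(cpm_normal, cpm_cancer, pseudo=0.5):
    """log2 ratio of cancer to normal; the pseudocount (an assumption) avoids log of zero."""
    return math.log2((cpm_cancer + pseudo) / (cpm_normal + pseudo))


def passes_fold_filter(log2_fc, min_fold=4.0):
    """True when the change is at least min_fold in either direction."""
    return abs(log2_fc) >= math.log2(min_fold)


# Hypothetical read counts for one lncRNA in one normal and one cancer library.
normal_cpm = counts_per_million(40, 20_000_000)    # about 2 CPM
cancer_cpm = counts_per_million(900, 30_000_000)   # about 30 CPM
lfc = log2_fold_change(normal_cpm, cancer_cpm)
print(round(lfc, 2), passes_fold_filter(lfc))
```

Working on the log₂ scale makes the four-fold criterion symmetric: a log₂ fold change of at least 2 or at most −2 passes the filter.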

TaqMan® primer design for lnc-FANCI-2:
Exon 6 (SEQ ID NO: 76):
CTGGAAAGGAGGAGAACATGAAACATTGCTTGAAGACAATGGCCGAGACAG CAGGTCCCACCCTGCACAGCCACCAGCATCTCTC CCCTCAGCCCTGTCTCCTCTTCTGCAGTTGGGATCTGCACATTTAAGCCTGAA
Exon 7 (SEQ ID NO: 77):
ATTGTCCTGTGAAGTGAAGTATGATCGGACAGCCTC TTTTCAGCTTTTATG AC AATGGAGACAGAGGAATTGTGGCTCTTGCCAAGGTCACAGGATTGGAATACAGAGCCAAGCCACCCCAGGACATGCAAGAGCCTCAGAAGGGAA
Primers for RT-qPCR:
Forward (SEQ ID NO: 78): 5′-ACAGCCACCAGCATCTCTC-3′
Probe (SEQ ID NO: 79): 5′-TGAAGTGAAGTATGATCGGACAGCCTC-3′
Reverse (SEQ ID NO: 80): 5′-CCACAATTCCTCTGTCTCCATT-3′

TaqMan® primer design for lnc-GLB1L2-1:
Last Exon 3 (SEQ ID NO: 81):
TCTCTCATCTGTGTTTTCAGGG CATGGACTGGAACTCCCAATA CCCCTGACATGGGCTGAGTCAACGTGGTCATGAACATGTGACAGGAG
Last Exon 2 (SEQ ID NO: 82):
GCAGCAGAAGT TGCAGAGAAGAGTGAGGCACGTTTG AAAAAGGCTGAAA AATGTTTCTGTCCAGGCAAGG GTGTGTGCTGAATGACTCAAGGATTTTTTGG
Primers for RT-qPCR:
Forward (SEQ ID NO: 83): 5′-CATGGACTGGAACTCCCAATA-3′
Probe (SEQ ID NO: 84): 5′-TGCAGAGAAGAGTGAGGCACGTTTG-3′
Reverse (SEQ ID NO: 85): 5′-CCTTGCCTGGACAGAAACATT-3′
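The placement of the lnc-FANCI-2 primers and probe within the exon sequences listed above can be checked with a short script. This is an illustrative sketch only: the variable and function names are hypothetical, and the exon sequences are entered as contiguous strings, on the assumption that the internal spaces in the printed listing are typographic and not part of the sequence.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Exon sequences from SEQ ID NOs: 76 and 77, with internal spaces removed.
EXON6 = ("CTGGAAAGGAGGAGAACATGAAACATTGCTTGAAGACAATGGCCGAGACAG"
         "CAGGTCCCACCCTGCACAGCCACCAGCATCTCTC"
         "CCCTCAGCCCTGTCTCCTCTTCTGCAGTTGGGATCTGCACATTTAAGCCTGAA")
EXON7 = ("ATTGTCCTGTGAAGTGAAGTATGATCGGACAGCCTC"
         "TTTTCAGCTTTTATGAC"
         "AATGGAGACAGAGGAATTGTGGCTCTTGCCAAGGTCACAGGATTGGAATACAGAGCC"
         "AAGCCACCCCAGGACATGCAAGAGCCTCAGAAGGGAA")

FORWARD = "ACAGCCACCAGCATCTCTC"          # SEQ ID NO: 78
PROBE = "TGAAGTGAAGTATGATCGGACAGCCTC"    # SEQ ID NO: 79
REVERSE = "CCACAATTCCTCTGTCTCCATT"       # SEQ ID NO: 80

# str.find returns -1 when the query is absent.
forward_pos = EXON6.find(FORWARD)
probe_pos = EXON7.find(PROBE)
reverse_pos = EXON7.find(reverse_complement(REVERSE))

print("forward primer found in exon 6:", forward_pos >= 0)
print("probe found in exon 7:", probe_pos >= 0)
print("reverse primer site found in exon 7:", reverse_pos >= 0)
print("probe and reverse primer do not overlap:",
      probe_pos + len(PROBE) <= reverse_pos)
```

With the sequences as listed, the forward primer matches the sense strand of exon 6, the probe matches the sense strand of exon 7, and the reverse primer is the reverse complement of a downstream stretch of exon 7, so the amplicon crosses the exon 6/exon 7 junction.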

FIG. 13 shows an increase of lnc-FANCI-2 expression and a decrease of lnc-GLB1L2-1 expression along with cervical lesion progression from normal cervix. lnc-FANCI-2 and lnc-GLB1L2-1 RNA expression was examined by RT-qPCR in 24 normal, 25 CIN 2-3, and 23 cancer tissues. FIG. 14 shows that HPV infection increases lnc-FANCI-2 expression in HVK- and PHK-derived rafts and that viral E7 or E6 is responsible for the increase. The expression of lnc-FANCI-2 in human vaginal keratinocyte (HVK)-derived raft tissues without (HVK) or with HPV16 (HVK16) or HPV18 (HVK18) infection, in primary human keratinocyte (PHK)-derived raft tissues without or with HPV18 infection on day 8, day 12, and day 16, or in PHK rafts transduced with a retrovirus expressing HPV18 E6, E7 or E6E7 or with an empty control retrovirus was further validated by RT-qPCR. These results demonstrate that lnc-FANCI-2 expression responds to HPV18 infection and viral oncoprotein E6 and/or E7.

In data not shown, lnc-FANCI-2 was upregulated in isolated keratinocyte lines infected by high-risk HPVs, but not by low-risk HPV11 or epidermodysplasia verruciformis-associated HPV5 and 10.

The term “polynucleotide” as used herein refers to a polymer of greater than one nucleotide in length of ribonucleic acid (RNA), deoxyribonucleic acid (DNA), hybrid RNA/DNA, modified RNA or DNA, or RNA or DNA mimetics, including peptide nucleic acids (PNAs). The polynucleotides may be single- or double-stranded. The term includes polynucleotides composed of naturally-occurring nucleobases, sugars, and covalent internucleoside (backbone) linkages as well as polynucleotides having non-naturally-occurring portions which function similarly. Such modified or substituted polynucleotides are well known in the art and are referred to as “analogues.”

“Complementary” or “substantially complementary” refers to the ability to hybridize or base pair between nucleotides or nucleic acids, such as, for instance, between a sensor peptide nucleic acid or polynucleotide and a target polynucleotide. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single-stranded polynucleotides or PNAs are said to be substantially complementary when the bases of one strand, optimally aligned and compared and with appropriate insertions or deletions, pair with at least about 80% of the bases of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%.
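As an illustration of the percentage criterion, the following sketch scores base pairing between two strands. It is a simplified example under stated assumptions and not a definition of the term: both strands are written 5′ to 3′, have equal length, and are paired antiparallel without the insertions or deletions that the definition above permits; the function names and the 80% default are chosen for illustration.

```python
# Watson-Crick pairs, including A-U for RNA strands.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementarity(strand, other):
    """Percent of bases in `strand` that pair with `other`.

    Both strands are written 5' to 3' and paired antiparallel, so `other`
    is read in reverse. Gapped alignment is not attempted in this sketch.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((a, b) in PAIRS for a, b in zip(strand, reversed(other)))
    return 100.0 * paired / len(strand)


def substantially_complementary(strand, other, threshold=80.0):
    """Apply the 'at least about 80%' criterion as a hard cutoff (an assumption)."""
    return percent_complementarity(strand, other) >= threshold


print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))  # 100.0
print(percent_complementarity("ACGTACGTAC", "GTACGTACGA"))  # 90.0
```

A full implementation would align the strands first, for example with a pairwise alignment routine, before counting paired bases.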

Alternatively, substantial complementarity exists when a polynucleotide may hybridize under selective hybridization conditions to its complement. Typically, selective hybridization may occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 bases, for example at least about 75%, or at least about 90% complementarity.

The term “homologous region” refers to a region of a nucleic acid with homology to another nucleic acid region. Whether a “homologous region” is present in a nucleic acid molecule is determined with reference to another nucleic acid region in the same or a different molecule.

Hybridization conditions typically include salt concentrations of less than about 1 M, more usually less than about 500 mM, for example, less than about 200 mM. In the case of hybridization between a peptide nucleic acid and a polynucleotide, the hybridization can be done in solutions containing little or no salt. Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., and more typically greater than about 30° C., for example in excess of about 37° C. Longer fragments may require higher hybridization temperatures for specific hybridization as is known in the art. Other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents and extent of base mismatching, and the combination of parameters used is more important than the absolute measure of any one alone. Other hybridization conditions which may be controlled include buffer type and concentration, solution pH, presence and concentration of blocking reagents to decrease background binding such as repeat sequences or blocking protein solutions, detergent type(s) and concentrations, molecules such as polymers which increase the relative concentration of the polynucleotides, metal ion(s) and their concentration(s), chelator(s) and their concentrations, and other conditions known in the art.

As used herein, a “probe” is a polynucleotide capable of selectively hybridizing to a target sequence, a complement thereof, a reverse complement thereof, or to an RNA version of the target sequence, the complement thereof, or the reverse complement thereof. A probe may comprise ribonucleotides, deoxyribonucleotides, peptide nucleic acids, and combinations thereof. A probe may optionally comprise one or more labels. In some embodiments, a probe may be used to amplify one or both strands of a target sequence or an RNA form thereof, acting as a sole primer in an amplification reaction or as a member of a set of primers. In one aspect, probes include nucleotide sequences of 10 to 1,000 nucleotides. In other embodiments, the probes are 10-200, 10-30, 10-40, 20-50, 40-80, 50-150, or 80-120 nucleotides in length.

The use of the terms “a” and “an” and “the” and similar referents (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms first, second etc. as used herein are not meant to denote any particular ordering, but simply for convenience to denote a plurality of, for example, layers. The terms “comprising”, “having”, “including”, and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to”) unless otherwise noted. Recitation of ranges of values are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. The endpoints of all ranges are included within the range and independently combinable. All methods described herein can be performed in a suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”), is intended merely to better illustrate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention as used herein.

While the invention has been described with reference to an exemplary embodiment, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims. Any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.

1. A method of quantitating an expression level of a lnc-FANCI-2 polynucleotide in a sample containing cells from a test patient's cervix with one or more first polynucleotides that hybridizes to the lnc-FANCI-2 polynucleotide, the method comprising contacting the sample containing cells from the test patient's cervix with the one or more first polynucleotides, and detecting the level of hybridization of the one or more first polynucleotides to the lnc-FANCI-2 polynucleotide, comparing the level of hybridization in the sample containing cells from the test patient's cervix to a control level of hybridization in a control sample of normal cervical tissues, and determining differential expression of the lnc-FANCI-2 polynucleotide in the sample containing cells from the test patient's cervix when the level of hybridization for the sample containing cells from the test patient's cervix is at least about 300% of the control level of hybridization in the control sample, wherein the one or more first polynucleotides comprises a forward primer comprising 10-40 nucleotides of SEQ ID NO: 76, a probe comprising 10-40 nucleotides of SEQ ID NO: 77, and a reverse primer complementary to 10-40 nucleotides of SEQ ID NO: 77, wherein the probe and the reverse primer comprise non-overlapping sequences.
2. The method of claim 1, wherein detecting the level of hybridization of the one or more first polynucleotides to the lnc-FANCI-2 polynucleotide is done with real-time RT-PCR.
3. The method of claim 1, wherein the sample containing cells from the test patient's cervix comprises a PAP smear, a vaginal wash, or a cervical biopsy sample.
4. A method of classifying a patient as having a normal cervix, having stage 2-3 CIN, or having cervical cancer, the method comprising contacting a sample containing cells from the test patient's cervix with one or more first polynucleotides that hybridizes to the lnc-FANCI-2 polynucleotide, and detecting the level of hybridization of the one or more first polynucleotides to the lnc-FANCI-2 polynucleotide, comparing the level of hybridization in the sample containing cells from the test patient's cervix to a control level of hybridization in a control sample of normal cervical tissues, and determining a normal cervix in the subject when the level of hybridization is comparable to the control sample of normal cervical tissues, determining stage 2-3 CIN in the subject when the level of hybridization is at least about 300% to at least about 700% of the control level of hybridization in the control sample, and determining stage 2-3 CIN in the subject when the level of hybridization is at least about 1000% of the control level of hybridization in the control sample, wherein the one or more first polynucleotides comprises a forward primer comprising 10-40 nucleotides of SEQ ID NO: 76, a probe comprising 10-40 nucleotides of SEQ ID NO: 77, and a reverse primer complementary to 10-40 nucleotides of SEQ ID NO: 77, wherein the probe and the reverse primer comprise non-overlapping sequences.
5. The method of claim 4, wherein detecting the level of hybridization of the one or more first polynucleotides to lnc-FANCI-2 polynucleotide is done with real-time RT-PCR.
6. The method of claim 4, wherein the sample containing cells from the test patient's cervix comprises a PAP smear, a vaginal wash, or a cervical biopsy sample.